Harnessing Twitter "Big Data" for Automatic Emotion Identification

Authors

  • Wenbo Wang
  • Lu Chen
  • Krishnaprasad Thirunarayan
  • Amit P. Sheth
Abstract

User-generated content on Twitter (produced at an enormous rate of 340 million tweets per day) provides a rich source for gleaning people’s emotions, which is necessary for a deeper understanding of people’s behaviors and actions. Extant studies on emotion identification lack comprehensive coverage of “emotional situations” because they use relatively small training datasets. To overcome this bottleneck, we automatically created a large emotion-labeled dataset (about 2.5 million tweets) by harnessing the emotion-related hashtags available in the tweets. We applied two different machine learning algorithms for emotion identification to study the effectiveness of various feature combinations, as well as the effect of training data size on the emotion identification task. Our experiments demonstrate that a combination of unigrams, bigrams, sentiment/emotion-bearing words, and part-of-speech information is most effective for gleaning emotions. The highest accuracy (65.57%) is achieved with training data containing about 2 million tweets.
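The hashtag-based labeling and the unigram/bigram features mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the hashtag-to-emotion map, the tokenizer, the rule of discarding tweets that carry zero or several emotion labels, and the names `EMOTION_HASHTAGS`, `label_tweet`, and `ngram_features`. The abstract does not give the authors' actual hashtag list, filtering rules, or feature extraction code, and the sentiment-word and part-of-speech features are left out.

```python
import re
from collections import Counter

# Hypothetical hashtag-to-emotion map (an assumption for illustration;
# the abstract does not list the hashtags the authors used).
EMOTION_HASHTAGS = {
    "#happy": "joy",
    "#sad": "sadness",
    "#angry": "anger",
    "#scared": "fear",
}

# Simple tokenizer: hashtags, mentions, and words with an optional apostrophe part.
TOKEN_RE = re.compile(r"#\w+|@\w+|\w+(?:'\w+)?")


def tokenize(text):
    """Lowercase the tweet and split it into tokens."""
    return TOKEN_RE.findall(text.lower())


def label_tweet(text):
    """Return (emotion, tokens) if the tweet carries exactly one emotion label.

    The labeling hashtags are removed from the returned tokens so that a
    classifier cannot simply memorize the label. Tweets with no emotion
    hashtag, or with hashtags for several emotions, yield None.
    """
    tokens = tokenize(text)
    emotions = {EMOTION_HASHTAGS[t] for t in tokens if t in EMOTION_HASHTAGS}
    if len(emotions) != 1:
        return None
    remaining = [t for t in tokens if t not in EMOTION_HASHTAGS]
    return emotions.pop(), remaining


def ngram_features(tokens):
    """Count unigrams and bigrams (bigrams are joined with an underscore)."""
    bigrams = [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    return Counter(tokens + bigrams)


if __name__ == "__main__":
    labeled = label_tweet("Finally got the job! #happy")
    if labeled is not None:
        emotion, tokens = labeled
        print(emotion, dict(ngram_features(tokens)))
```

Feature counts of this kind could then be passed to any standard text classifier. Which two algorithms the authors used is not stated in the abstract.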


Similar resources

Meta-level sentiment models for big social data analysis

People react to events, topics and entities by expressing their personal opinions and emotions. These reactions can correspond to a wide range of intensities, from very mild to strong. An adequate processing and understanding of these expressions has been the subject of research in several fields, such as business and politics. In this context, Twitter sentiment analysis, which is the task of a...


Discovering Emotions in the Wild: An Inductive Method to Identify Fine-grained Emotion Categories in Tweets

This paper describes a method to expose a set of categories that are representative of the emotions expressed on Twitter inductively from data. The method can be used to expand the range of emotions that automatic classifiers can detect through the identification of fine-grained emotion categories human annotators are capable of detecting in tweets. The inter-annotator reliability statistics fo...


2016 Olympic Games on Twitter: Sentiment Analysis of Sports Fans Tweets using Big Data Framework

Big data analytics is one of the most important subjects in computer science. Today, due to the increasing expansion of Web technology, a large amount of data is available to researchers. Extracting information from these data is one of the requirements for many organizations and business centers. In recent years, the massive amount of Twitter's social networking data has become a platform for ...


Design and Test of the Real-time Text mining dashboard for Twitter

One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in datasets that are currently being produced at high speed, in large volumes, and in a wide variety of formats. Data with such features is called big data. Extracting, processing, and visualizing this huge amount of data has become one of the concerns of data science scholar...


Harnessing Hadoop: Understanding the Big Data Processing Options for Optimizing Analytical Workloads

Asserting that data is vital to business is an understatement. Organizations have generated more and more data for years, but struggle to use it effectively. Clearly data has more important uses than ensuring compliance with regulatory requirements. In addition, data is being generated with greater velocity, due to the advent of new pervasive devices (e.g., smartphones, tablets, etc.), social W...




Publication date: 2012